Method for mapping a DNA molecule comprising an ad infinitum amplification step

ABSTRACT

The invention relates to a Happy Mapping type method for mapping a DNA molecule comprising a step enabling ad infinitum amplification of the DNA of each panel, consisting of the following: (i) a first amplification with the aid of a primer comprising 10-30 defined nucleotides at its 5′ end and 5-10 random nucleotides at its 3′ end, and (ii) a second amplification with the aid of a primer comprising at least the defined oligonucleotide of the 5′ end of the primer used in step (i).

This application is based on and claims the benefit of French application 99/04388, filed Apr. 8, 1999, and International application PCT/FR00/00891, filed Apr. 7, 2000. The entire disclosures of these applications are relied upon and incorporated by reference herein.

The present invention relates to a method for mapping a DNA molecule of the Happy Mapping type which comprises a step for ad infinitum amplification of the DNA of each panel. The present invention also relates to a kit for implementing this mapping method and to the use of the maps obtained for identifying genes imparting a phenotype of interest.

The emergence of a great number of genome sequencing projects, notably that of the human genome, inevitably requires the development of novel fast and accurate mapping methods in order to assemble the raw data from systematic sequencing. Moreover, if the present trend toward shotgun sequencing of entire genomes is confirmed, detailed genomic maps will need to become available.

More than a dozen microbial genomes have already been fully sequenced by direct shotgun sequencing of their genome (tigr.org/tdb/mdb.mdb.html). However, these genomes range in size from 580 kb to 4 Mb. The use of such an approach for sequencing larger genomes such as the human genome (3,000 Mb), as suggested by Weber and Myers, 1997, and announced by Venter et al., 1998, does however raise certain questions. Indeed, the large size of these genomes and the presence of many repeated sequences make the assembly of the sequencing results difficult. Thus, detailed genomic maps prove necessary in order to allow these data to be processed.

The drawing up of detailed genomic maps may also be very helpful for phylogenetic studies. Indeed, studies on the evolution of genomes have clearly shown that the expression of many genes depends on their localization in a certain genetic context. Recent developments in genomics now allow the evolution of genomes to be studied in more detail on the basis of changes in syntenic relationships. With detailed maps obtained for two species, the genetic links which have been maintained between these two species during evolution may be assessed.

Another field for which the provision of detailed genetic maps would be of fundamental interest is the localization of QTL (Quantitative Trait Loci). Indeed, most variations within a population or among different races, for example, are of a quantitative nature. Certain variations such as the size or weight of individuals, the flowering date for plants or the amount of milk produced in mammals do not fall into well-defined classes according to Mendelian proportions, but rather vary continuously, along a gradient from one extreme to the other. These variations, preserved in the line of descent, are therefore transmitted genetically. The loci involved in the variation of quantitative phenotype traits are called QTL. Detection of links between a QTL and genetic markers provides a robust method for identifying these QTL. It is possible to localize a QTL by the so-called “interval mapping” method (Lander and Botstein, 1989) between two informative markers separated by more than 20 cM. However, such an interval makes the identification of the gene problematic. It is therefore quite useful, if a large number of markers are available, to be able to zoom in on the region of interest and to perform fine mapping in order to localize the sought-after gene specifically.

A conceivable mapping approach is mapping by radiation hybrids (RH, Radiation Hybrid mapping) (Cox et al., 1990; Gyapay et al., 1996). It consists in irradiating cell lines, causing random chromosomal breaks. The different chromosome fragments generated are then integrated into the genome of rodent cells. Thus, it is possible to determine the distance separating two markers, knowing that the closer they are, the more likely they are to be incorporated within the same fragment and therefore detected in the same line. However, this approach has certain drawbacks in addition to the cumbersomeness of setting up a panel of radiation hybrids. Indeed, (i) certain loci which would not be cloned cannot be integrated into a genome map; (ii) the interpretation of results may be confusing when the inserts are rearranged or ligated with each other; (iii) the presence of exogenous DNA, in this case that of the hamster host cell, very often requires that a certain number of markers be set aside, namely those giving a positive response with this exogenous DNA.

With the more flexible Happy Mapping method (Dear and Cook, 1993), these different problems may be circumvented. With this method, both events analyzed by crosses in formal genetics, i.e. crossing-over and segregation, may be reproduced in vitro. In practice, crossing-over is mimicked by a random break of DNA into fragments, the size of which depends on the sought-after mapping resolution. The markers are then segregated by a random distribution of these fragments into aliquots of at least one equivalent of haploid genome each, then detected by PCR. Those which are genetically linked tend to remain together in the same aliquot, whereas those which are not linked are randomly distributed. Their order and the distance which separates them may be inferred from the frequency of their co-segregation by a statistical calculation. It is worth recalling here that a panel which may be used for Happy mapping is simple to produce, as it only requires a few days or even a few weeks. In addition, it may be adapted to any resolution level, according to the size of the selected fragments, and may even result in molecular cloning of fragments of interest for sequencing.

The Happy Mapping method comprises the following steps:

a) Genomic DNA is broken by irradiation,

b) About one equivalent of haploid genome is then placed in each well of a 96-well plate, which corresponds to about 60% of the initial genomic DNA (statistical distribution of the markers).

c) the DNA is amplified by PCR,

d) and the markers are then detected.

This mapping method has already proved to be reliable for genomes as different in size as the human chromosome 14 (100 Mb) (Dear et al., 1998) or that of a protozoan parasite of the intestinal epithelium of many mammals, Cryptosporidium parvum (10 Mb) (Piper et al., 1998). However, in these two investigations, no satisfactory method was described for amplifying the initial panel so as to map an unlimited number of markers of any kind. In Dear et al., only a small portion of the total DNA, flanked by repeated sequences, could be mapped. In Piper et al., the amplification level was not sufficient to allow direct detection of the markers by PCR. These authors had to resort to nested PCR in order to visualize the markers to be mapped. Further, the amplification method used only allows a limited number of markers to be mapped, requiring a mapping panel to be reconstructed in order to localize further markers.

In order for an amplification method to be contemplated for Happy mapping, it should meet the following three criteria:

(i) A DNA amount close to one equivalent of haploid genome should be sufficient as a template;

(ii) the entire genetic information should be amplified;

(iii) the formed panel should be able to be re-amplified ad infinitum in order to provide mapping of an unlimited number of markers.

Thus, the problem consists of amplifying the entire DNA in each well, whereby said amplification should not produce artefacts in the random distribution of markers. The objective is the development of an approach for total, homogeneous and ad infinitum amplification of genomic DNA.

The conventional PCR technique has evolved, giving rise to many amplification methods, each with its own specificity. For example, it is possible to amplify several sequences simultaneously by using several pairs of primers in the same reaction tube (Apostolakos et al., 1993). However, the number of primer pairs rarely exceeds 3. Indeed, above this number, the amplifications lose their specificity. Other techniques, more or less derived from PCR, have been developed: LCR, Gap-LCR, ERA, CPR, SDA, TAS, NASBA. However, none of these amplification techniques seems to provide an adequate solution for total amplification of DNA.

The T-PCR technique consists of a first amplification step by means of primers containing, on their 3′ end, random sequences reproducing all possible combinations, and a defined sequence on their 5′ end. Under these operating conditions, these oligonucleotides will randomly pair up over the whole length of the sequence and the amplification cycles will provide incorporation of said defined sequences into all the amplified fragments. The second step consists of amplifying the fragments obtained in the first step by means of a primer including exclusively the defined sequence of the 5′ end of the primers of the first step. This technique has been described in U.S. Pat. No. 5,731,171 and Grothues et al., 1993. According to the authors, this method provides amplification of DNA fragments of 400 bp and also of genome fragments of up to 40 megabases. For a PCR technique to be applicable to Happy Mapping, the amplification should be general, while not introducing any selection (bias) in the portions of amplified DNA. However, this point has only been tested by hybridization, which is not conclusive. Moreover, U.S. Pat. No. 5,731,171 shows that total amplification may only be performed if at least 17 DNA equivalents are available initially. A priori, this shows that the T-PCR amplification method cannot be used for mapping with the Happy Mapping method, since DNA amounts corresponding to only 1 equivalent must be amplified. Further, the amplification step is discussed in U.S. Pat. No. 5,731,171 in order to obtain markers and not for preparing the substrate on which the markers will be positioned. The fact that the inventor of the Happy Mapping method itself did not retain T-PCR, but rather nested PCR, during subsequent development of his technique proves that the technique as described and tested by hybridization did not seem to be satisfactory. The solution found within the framework of the present invention was to adapt T-PCR to Happy Mapping. The developed methodology is found to be advantageous for amplifying the entire DNA in each well without introducing any artefacts. Consequently, this amplification method represents a technical aid so that Happy Mapping may be implemented to its full extent.

Thus, the present invention relates to a method for mapping a DNA molecule, characterized in that it comprises the steps:

a) breaking the DNA molecule in order to obtain DNA fragments, the size of which depends on the selected resolution,

b) distributing said fragments in receptacles in order to have a DNA amount of about 0.5 to 1.5 haploid genome equivalents per receptacle,

c) amplifying the DNA contained in the receptacles by an amplification method comprising the following steps: i) a first amplification by means of a primer comprising 10 to 30 defined nucleotides on its 5′ end, and 5 to 10 random nucleotides on its 3′ end, and ii) a second amplification by means of a primer comprising at least the defined oligonucleotide of the 5′ end of the primer used in step i),

d) detecting the presence or absence of markers in the receptacles.

Preferably, the primer used in step i) includes 20 defined nucleotides at its 5′ end, and 6 random nucleotides at its 3′ end. In this case, the primer used in step ii) may include the 20 defined nucleotides at the 5′ end of the primer used in step i).

Quite advantageously, the primer used in step i) corresponds to sequence SEQ ID NO.1 and the primer used in step ii) corresponds to sequence SEQ ID NO.2.

This amplification method therefore consists of two complementary steps. The first phase consists of an amplification with only one oligonucleotide as described above, and the reaction is carried out in a final volume of 30 to 70 μl per microplate well, preferably 50 μl, with 3 to 7 units, preferably 5 units, of AmpliTaq polymerase (Perkin Elmer). Any polymerase equivalent to AmpliTaq may be used, with 2 to 6 μM of primer, preferably 4 μM.

The PCR reaction may be performed in the following way (@ means “at” or “at about”): 1×[5 mins @95° C.]; 50×[45 secs @92° C., 2 mins @37° C., ramp from 37° C. to 55° C. at 0.1° C./sec, 4 mins @55° C.]; 15×[45 secs @92° C., 1 min @55° C., 3 mins @72° C.]; 1×[5 mins @72° C.]. Of course, any equivalent cycle may be implemented in order to perform the invention.

For the second phase of the reaction, 1/20th to 1/200th, preferably 1/50th, of the product obtained during the first phase is used as a template. The PCR reaction may be carried out in a final reaction volume of 5 to 20 μl, preferably 10 μl, per microplate well. Taq DNA polymerase (Promega) may be used with 0.5 to 3 μM, preferably 1.5 μM, of primer as defined above (primer for step ii)). The PCR reaction may be carried out according to the following cycle: 1×[2 mins @94° C.]; 50×[30 secs @92° C., 45 secs @54° C., 3 mins @72° C.]; 1×[5 mins @72° C.].
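As a rough planning aid, the two cycling programs described above can be written down as data and their programmed step times summed. The sketch below is illustrative only: the names `PHASE_1`, `PHASE_2` and `program_seconds` are hypothetical, the ramp from 37° C. to 55° C. at 0.1° C./sec is assumed to take 180 seconds, and the instrument's transition times between all other steps are ignored.

```python
# Each program is a list of (repeat_count, [step durations in seconds]).
# Assumption: the slow ramp covers 18 °C at 0.1 °C/sec, i.e. 10 s per °C.
RAMP_37_TO_55_S = (55 - 37) * 10

PHASE_1 = [
    (1, [5 * 60]),                                 # 5 mins @95 °C
    (50, [45, 2 * 60, RAMP_37_TO_55_S, 4 * 60]),   # low-stringency cycles
    (15, [45, 60, 3 * 60]),                        # higher-stringency cycles
    (1, [5 * 60]),                                 # final 5 mins @72 °C
]

PHASE_2 = [
    (1, [2 * 60]),                                 # 2 mins @94 °C
    (50, [30, 45, 3 * 60]),                        # 50 cycles
    (1, [5 * 60]),                                 # final 5 mins @72 °C
]


def program_seconds(program):
    """Sum the programmed step times of a cycling program, in seconds."""
    return sum(repeats * sum(steps) for repeats, steps in program)


for name, program in (("phase 1", PHASE_1), ("phase 2", PHASE_2)):
    total = program_seconds(program)
    print(f"{name}: {total} s ({total / 3600:.2f} h)")
```

Under these assumptions the first phase amounts to roughly 9.5 hours of programmed time and the second phase to about 3.7 hours; actual run times on a given thermocycler will be longer.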

Of course, the parameters of this amplification method may be changed or adapted by one skilled in the art, according to the individual case.

Within the scope of the invention, a “DNA molecule” means a molecule which corresponds to a genome, a chromosome, or a fragment of a genome or a chromosome. In addition, said molecule may be derived from a genome or a chromosome which has possibly undergone changes and/or processing.

A preferred embodiment of the invention consists of extracting the DNA molecule from cells encapsulated in agarose blocks, then lysed in order to release said intact molecule. Said cells may correspond to any cell from the plant, animal, or bacterial kingdoms.

The isolated DNA molecule is then broken by γ irradiation, by enzymatic digestion, notably by the action of endonucleases such as restriction enzymes, or by mechanical action. The DNA fragments obtained may be separated by means of any separation technique known to one skilled in the art, notably by electrophoresis, preferably by pulsed-field electrophoresis for obtaining large fragments, before distribution (step b) of the method described above).

Microtitration plates are advantageously used as receptacles. The fragments are thereby distributed into the 96 wells of a microtitration plate. For this purpose, the fragments are distributed in order to have an amount of DNA per receptacle (preferably per well) of about 1 equivalent of haploid genome, i.e. of the order of 2 pg for a mammal genome, for example.

Thus, the mapping method according to the invention is also characterized in that it comprises a step for amplifying the entire genetic information contained in the receptacles.

The DNA amplified in each well may then be distributed in order to prepare daughter plates. In this case, the markers are detected in the wells of the daughter plates. However, detection of the markers may also be carried out directly in the wells of the mother plate, but in this case only a limited number of markers may be analyzed.

The markers likely to be present in the wells may be amplified by means of specific primers before the detection step. Detection is usually carried out after electrophoretic migration in a gel appropriate to the fragment size. The detection may also be carried out by means of probes specific to the markers. These probes may be capture probes, directly or indirectly fixed on a solid support, or probes in the free state. A “capture probe” is or may be immobilized on a solid support by any appropriate means, for example by covalent bonding, by adsorption or by direct synthesis on a solid support. These techniques are notably described in Patent Application WO 9210092, incorporated by reference herein. A “detection probe” may be labeled by means of a label selected, for example, from radioactive isotopes, enzymes, in particular enzymes able to act on a chromogenic, fluorigenic or luminescent substrate (notably a peroxidase or an alkaline phosphatase), chromophore chemical compounds, chromogenic, fluorigenic or luminescent compounds, analogues of nucleotide bases, and ligands such as biotin. The detection methods in 96-well microplates are within the capability of one skilled in the art.

For example, the PCR amplification product is denatured and hybridized in a microplate well on which is fixed a capture oligonucleotide (Running, 1990) or a single-stranded DNA containing a capture sequence (Kawai, 1993). At least one of the primers used in PCR is, for example, biotinylated, and detection of hybridization is performed by adding streptavidin coupled with an enzyme such as peroxidase, then a chromogenic substrate for the enzyme. By using a capture oligonucleotide fixed on the wells, a biotinylated PCR primer and an internal standard for amplification, even quantitative PCR was made feasible (Berndt, 1995). These publications are incorporated by reference herein.

A preferred embodiment of the present invention lies in the detection of markers on DNA chips. The principle of this technique consists in identifying DNA sequences on the basis of molecular hybridization. The chip bears, grafted on a suitable surface, hundreds or thousands of oligonucleotides of interest or PCR products corresponding to markers for which a map is desired. The DNA of the wells is denatured, labeled and then placed under hybridization conditions with the chip. The advantage of chips lies in their capability of providing hundreds, or even thousands, of pieces of information from a single DNA sample. Further, the overall experimental scheme is very simple and fast.

Another aspect of the invention relates to a kit for mapping a DNA molecule, characterized in that it provides implementation of the method according to the invention. This kit may notably comprise a primer comprising 10 to 30 defined nucleotides at its 5′ end and 5 to 10 random nucleotides at its 3′ end, and/or a primer comprising at least the defined oligonucleotide of the 5′ end of the aforementioned primer. Preferably, the kit comprises a primer of sequence SEQ ID NO.1 and/or a primer of sequence SEQ ID NO.2. The kit mentioned above is therefore useful for preparing the panels necessary for drawing up genomic or chromosomal maps.

Another aspect of the invention concerns DNA molecule maps obtained by the method according to the invention, or by any other equivalent method, and the use of said maps for identifying genes imparting a phenotype of interest (notably in plants), for identifying genes responsible for hereditary diseases, notably in humans, and for identifying quantitative trait loci (QTL). Another aspect concerns the use of maps according to the invention as an aid for assembling the data from massive shotgun sequencing of a DNA molecule.

Reference will be made to the captions of the figures shown hereafter in the continuation of the description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: Illustration of the method according to the invention. Three randomly selected markers on the genome are illustrated (A, B and Z). The different Happy mapping steps are detailed in FIG. 1, as follows:

(FIG. 1A) Genomic DNA is prepared from cells encapsulated in agarose,

(FIG. 1B) it is randomly fractionated,

(FIG. 1C) and the fragments are selected at a given size by pulsed-field electrophoresis,

(FIG. 1D) next, deposited in a microtitration plate in an amount of 1 equivalent of haploid genome per well.

(FIG. 1E) This will form the mapping panel after amplification by two PCR steps.

(FIG. 1F) Each daughter plate is then used for mapping a marker. PCR is carried out with specific primers and here the analysis is performed on an agarose gel.

(FIG. 1G) The co-segregation frequency of the markers enables their respective positions to be determined and a physical map to be drawn up. Calculation of the marker association frequency (LOD score): the association frequency of the markers in the same well depends on the distance which separates them. Two close markers are often or always present together (case of markers A and B) and two remote markers are randomly associated (case of markers A and Z or B and Z).

FIGS. 2A and 2B: Results of the mapping according to the invention. The region of the human chromosome 2 is indicated on the left of FIG. 2A and continued in FIG. 2B. The map was generated by associating the markers two by two with a LOD score larger than 4. The markers indicated on the right portion of the map were unambiguously positioned with respect to each other. The “floating” markers located on the left portion of the map exhibit strong association (LOD score larger than 6) with the markers facing them, but only exhibit low association with adjacent markers. The localization of the centromere is indicated by a dotted line.

FIGS. 3A and 3B: Comparison of the mapping results. Three maps are illustrated in FIGS. 3A and 3B. The genetic map (left) includes the polymorphic markers localized by family studies. The map created by radiation hybrids (middle) results from the localization of both polymorphic and non-polymorphic markers on the G3 panel (Deloukas et al., 1998). The right map was obtained with the method according to the invention. The markers positioned on two different maps are illustrated by fine lines joining both maps. The localization of the centromere is indicated by a dotted line.

The present invention therefore provides a novel method for ad infinitum amplification of a small amount of genomic DNA (about 1 haploid genome equivalent, i.e. about 2 pg in the case of the human genome) required for forming a panel which may be used for mapping. The developed technique, carried out in two PCR steps, provides amplification of almost the entire genetic content. This approach enables a large number of markers to be mapped on a portion of the human chromosome 2. The results of the mappings carried out by implementing the method according to the invention are in agreement with all the data described in the literature. The method according to the invention is simple to reproduce as it only requires a few days or even a few weeks. In addition, it is suitable for any resolution level, according to the size of the generated and selected fragments. It may even result in the molecular cloning of fragments of interest for sequencing.

The most critical step in the described technique is the step for total and homogeneous amplification of the genome. The solution suggested by Paul Dear, inventor of the Happy Mapping method, is the use of primers anchored on repeated elements all along the genome (Alu, LINE) (IRS-PCR, i.e. Interspersed Repetitive Sequence-PCR). This solution has enabled a thousand markers to be positioned on the human chromosome 14 (Dear et al., 1998). However, IRS-PCR is limited to the amplification, and therefore to the mapping, of markers close to the repeated sequences and flanked by them. This type of amplification is therefore restricted to a small portion of the genome (as only the sequences flanked by two Alu or LINE sequences are amplified) and is only applicable to the human genome or the genome of primates, unless specific amplification conditions are defined for each investigated organism. This is why the object of the present invention consists in a technique for uniformly amplifying any genome, without losing any markers. In order to be usable for mapping of the Happy Mapping type, the amplification method should be both quantitative (it should amplify to a sufficient level a small quantity of initial DNA corresponding to about one genome equivalent) and qualitative (no notable loss of genetic information).

As mentioned earlier, various approaches have been considered for solving the problem. The first three from the literature (Telenius et al., 1992; Zhang et al., 1992; Grothues et al., 1993) have only afforded few results. The method by Telenius et al. (1992) uses an oligonucleotide degenerate at 6 positions for a PCR amplification achieved in only one step (DOP-PCR, i.e. degenerate oligonucleotide primed-PCR). But this method only works if a large amount of initial material is available (100 ng of human genomic DNA, i.e. about 50,000 haploid genome equivalents, or 50,000 copies of an isolated chromosome). However, mapping of the Happy Mapping type is only possible from DNA samples for which the amount is close to one haploid genome equivalent. By using the conditions described by Telenius et al., it is not possible to obtain amplification of more than 50% of the initial genetic material. This is in agreement with the results of Cheung and Nelson (1996), which show that the smaller the amount of genomic DNA used as a template, below a threshold of 600 pg (i.e. 300 haploid genome equivalents in the case of a mammal genome), the higher the risk of not amplifying a locus.
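The orders of magnitude quoted above follow from simple arithmetic once a mass per haploid genome is fixed. The sketch below assumes the figure of about 2 pg per mammalian haploid genome used elsewhere in this description; the function name `haploid_equivalents` is hypothetical. With this figure, the 50,000 equivalents quoted for the DOP-PCR input correspond to 100 ng of DNA.

```python
# Assumption taken from the description: about 2 pg of DNA per mammalian haploid genome.
PG_PER_HAPLOID_GENOME = 2.0


def haploid_equivalents(mass_pg, pg_per_genome=PG_PER_HAPLOID_GENOME):
    """Convert a DNA mass in picograms into haploid genome equivalents."""
    return mass_pg / pg_per_genome


print(haploid_equivalents(600))          # Cheung and Nelson threshold -> 300.0
print(haploid_equivalents(100 * 1000))   # 100 ng expressed in pg -> 50000.0
print(haploid_equivalents(2))            # one Happy Mapping well -> 1.0
```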

The second method, described by Zhang et al. (1992), uses an oligonucleotide that is entirely degenerate over its 15 positions (PEP, Primer Extension Preamplification). By varying the PCR conditions used by Zhang et al. (number of cycles, hybridization temperature, primer concentration), it seemed possible to amplify nearly 90% of the DNA template into a minimum of 200 copies. However, this approach does not allow further preamplification and therefore greatly limits the number of markers which may be mapped. Further, the amplification level does not enable the amplification product of a given marker to be viewed directly but forces the use of two pairs of primers, the second being nested in the first, and therefore an amplification carried out in two steps.

Another approach is carried out in two steps and is based on an oligonucleotide that is degenerate at the 9 residues located in its 3′ portion and includes a fixed sequence at its 5′ end used for the second amplification step (T-PCR, Tagged-PCR) (Grothues et al., 1993). However, there is no emphasis here on the qualitative aspect of the amplification. It has therefore been necessary to determine whether the quantitative parameter and the crucial qualitative parameter would be compatible with Happy mapping. In addition, it is clearly shown in U.S. Pat. No. 5,731,171 that the lower limit for amplification is 17 genome equivalents, which cannot be contemplated in Happy Mapping. Another significant drawback of the method described in U.S. Pat. No. 5,731,171 is the risk of contamination by foreign DNA. Indeed, during the low stringency amplification step, each cycle requires “manual” introduction of DNA polymerase. As the number of cycles has to be increased for very small amounts of DNA, the risk of contamination increases accordingly.

All these difficulties have been overcome by the present invention. The investigations performed have enabled a PCR type method to be used within the framework of Happy Mapping. The first amplification step should advantageously comply with parameters such as the primer concentration, the number of PCR cycles, the hybridization temperature, as well as the length of the degenerate chain at the 3′ end. The selection of the fixed sequence at the 5′ end of the oligonucleotide is important for the second amplification step. This second phase is itself also very dependent on the amount of primer used and on the number of PCR cycles performed.

Mapping Method Comprising an ad Infinitum Amplification Step.

Cells collected from the subject under investigation are encapsulated in agarose blocks (FIG. 1A), and then lysed in order to release intact DNA. According to the size of the desired fragments, the DNA is then deliberately broken, or is broken incidentally, at random (by γ irradiation, enzymatic digestion or mechanical action) in order to obtain fragments for which the size depends on the selected resolution (FIG. 1B). The DNA is then caused to migrate by pulsed field gel electrophoresis (PFGE) in order to separate fragments with homogeneous sizes (FIG. 1C), which are distributed in the 96 wells of a microtitration plate in order to have only an amount close to one haploid genome equivalent of DNA per well (FIG. 1D). In this way, each well has an incomplete representation of the genome which is specific to it. The entire genetic information contained in each well is then amplified by PCR.

Each well now contains a large amount of DNA so that this material may be distributed in a great number of daughter plates (FIG. 1E). Hence, each daughter plate will be used for determining the distribution of a marker, or of a limited number of markers if detection is achieved through electrophoresis, for example by means of PCR using oligonucleotides specific to the relevant marker as primers (FIG. 1F). The PCR products are then analyzed by conventional electrophoresis on an agarose gel. This operation is repeated for a large number of markers.

If one equivalent of genome is distributed per well, each marker will be found in 65% of the wells (Poisson's law). Two non-linked markers (separated by a distance larger than the mean size of the aliquot fragments) will be randomly distributed and found together in 42% of the wells (65%×65%). Conversely, two markers separated by a distance less than this average size will segregate together all the more frequently the closer they are to one another, up to the point of being found together in 65% of the aliquots in the case of very closely linked markers. Each occurrence is recorded and the links between the markers are calculated (LOD score), as in conventional genetics, by means of computer software packages such as RHmap (Boehnke et al., 1991) or RHmapper (Slonim et al., 1995). The frequency of association between two markers, expressing the physical distance which separates them on the chromosome, then enables a map to be inferred, which groups together all the associated markers, two by two, within the same linkage group (FIG. 1G).
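The expected frequencies and the linkage statistic described above can be illustrated with a short calculation. The sketch below is a simplified illustration, not the algorithm implemented in RHmap or RHmapper: `fraction_positive` applies the Poisson zero-class formula, and `lod_score` is a hypothetical two-marker statistic that compares the observed well counts with the counts expected if the two markers were distributed independently. Note that for exactly one genome equivalent per well the Poisson formula gives about 63%, close to the 65% quoted above.

```python
import math


def fraction_positive(mean_equivalents):
    """Expected fraction of wells holding at least one copy of a marker (Poisson)."""
    return 1.0 - math.exp(-mean_equivalents)


def lod_score(n_both, n_a_only, n_b_only, n_neither):
    """Log10 likelihood ratio: observed co-segregation versus independence.

    Simplified sketch: the 'linked' model uses the observed frequencies of the
    four well classes, the 'unlinked' model the product of the marginal
    frequencies of markers A and B.
    """
    counts = [n_both, n_a_only, n_b_only, n_neither]
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one well is required")
    p_a = (n_both + n_a_only) / n
    p_b = (n_both + n_b_only) / n
    independent = [p_a * p_b, p_a * (1 - p_b), (1 - p_a) * p_b, (1 - p_a) * (1 - p_b)]
    lod = 0.0
    for count, p_null in zip(counts, independent):
        if count > 0:  # empty classes contribute nothing to the likelihood
            lod += count * math.log10((count / n) / p_null)
    return lod


p = fraction_positive(1.0)
print(f"marker present in {p:.1%} of wells; unlinked pair together in {p * p:.1%}")
print(f"tightly linked pair, 96 wells: LOD = {lod_score(60, 0, 0, 36):.1f}")
print(f"unlinked pair, 100 wells:      LOD = {lod_score(36, 24, 24, 16):.1f}")
```

In this toy model a pair of markers that always co-segregate across a 96-well panel scores far above the threshold of 4 used for the maps described here, while counts that match the independence expectation score zero.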

EXAMPLE 1 Mapping of Markers on the Human Chromosome 2

Materials and Methods

a) Preparation of the Mapping Panel

Human lymphocytes are sampled and purified on Ficoll (Ficollpack—Pharmacia) according to instructions provided by the supplier.After counting, the cells are encapsulated in agarose blocks(InCert—FMC) in an amount of 10⁵-10⁶ cells/ml of agarose, and then lyzedin situ by 5 successive washes, spread out over 48 hours, in 500 ml of alysis solution (10 mM Tris-HCl pH 7.4; 1 mM EDTA; 1% LiDS) in order torelease the intact DNA. The agarose blocks are then equilibrated in thesame LiDS-free solution. During these steps, the DNA is randomlyfractionated so that the average size of the resulting fragments isclose to 6-7 Mb (results not shown). The DNA is then caused to migrateby electrophoresis with pulsed fields (Pulsed Field Gel Electrophoresis)(PFGE) in a CHEF Mapper (BioRad) [on agarose gel 0,8% (SeaKem Goldagarose—FMC) in TAE IX: migration at 2V/cm in TAE IX for 96 hours @14°C.: field inversion period of 40 mins 23 secs at 106°][3]. After 48hours of migration, the wells as well as the upper 2 mm of gel areremoved in order to avoid any contamination by small DNA fragments whichwould be trapped in the wells and gradually salted out duringelectrophoresis. Upon completion of the migration, the portion of thegel containing the size markers (chromosomes of S. pombe—FMC) is coloredwith BET. The genomic DNA is then sampled from the gel by means of glassmicrocapillaries (Minicaps 10 μm—Hirschmann Laborgerate) and then theseagarose samples are distributed in the 96 wells of a microtitrationplate in order to have only 2 pg of DNA per well i.e., a little lessthan one equivalent of haploid genome [4]. In this way, each well has anincomplete representation of the genome which is specific to it.

The first step consists of determining the amount of DNA present in the wells and adjusting it as required, to about 2 pg for human material. For this analysis, a marker is randomly selected on each human chromosome in order to determine the amount of DNA to be distributed in the mother plate. If one starts with one haploid genome equivalent, each marker should be present in 65% of the analyzed wells, according to Poisson's statistical law (Dear et al., 1993), which is checked by nested PCR.
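The check described above can be run in reverse: from the fraction of wells found positive for a test marker, the mean number of genome equivalents per well can be estimated by inverting the Poisson relation. The sketch below is illustrative only; the function name and the well counts are hypothetical and do not come from the experiments described.

```python
import math


def estimate_genome_equivalents(positive_wells: int, total_wells: int) -> float:
    """Estimate the mean genome equivalents per well from marker-positive wells.

    Inverts P(present) = 1 - exp(-m), assuming Poisson-distributed copies.
    """
    if not 0 <= positive_wells < total_wells:
        raise ValueError("positive_wells must be in [0, total_wells)")
    fraction = positive_wells / total_wells
    return -math.log(1.0 - fraction)


if __name__ == "__main__":
    # Hypothetical count: 62 of 96 wells positive for a test marker.
    m = estimate_genome_equivalents(62, 96)
    print(f"estimated genome equivalents per well: {m:.2f}")
```

An estimate well above or below 1 would suggest adjusting the amount of DNA distributed in the mother plate.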

b) Random Amplification of Genomic DNA

The DNA content of each well is amplified in two steps by PCR. The first step consists of an amplification using only one oligonucleotide (T3N6: 5′ AATTAACCCTCACTAAAGGGNNNNNN 3′) (SEQ ID NO.1). The reaction is performed in a final volume of 50 μl per well of the microplate, containing 50 mM KCl; 10 mM Tris-HCl pH 8.3; 0.001% (w/v) gelatin; 2.5 mM MgCl₂; 200 μM dNTP (Pharmacia); 5 units of AmpliTaq polymerase (Perkin Elmer) and 4 μM of T3N6 oligonucleotide. Each reaction is covered with a drop of mineral oil and the PCR is carried out in a PTC-200 thermocycler (MJ Research): 1×[5 mins @95° C.]; 50×[45 secs @92° C., 2 mins @37° C., ramp from 37° C. at 0.1° C./sec, 4 mins @55° C.]; 15×[45 secs @92° C., 1 min @55° C., 3 mins @72° C.]; 1×[5 mins @72° C.].

For the second phase of the reaction, 1/50th of the first phase is used as a template. The PCR is carried out in a final reaction volume of 10 μl per well of the microplate, containing 50 mM KCl; 10 mM Tris-HCl pH 9; 0.1% Triton® X-100; 2.5 mM MgCl₂; 300 μM dNTP (Pharmacia); 2 units of Taq DNA polymerase (Promega) and 1.5 μM of T3 oligonucleotide (5′ AATTAACCCTCACTAAAGGG 3′) (SEQ ID NO.2). The PCR is carried out in a PTC-200 thermocycler (MJ Research): 1×[2 mins @94° C.]; 50×[30 secs @92° C., 45 secs @54° C., 3 mins @72° C.]; 1×[5 mins @72° C.]. This second phase is carried out four times, independently, and then the four reactions are pooled.
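The relationship between the two primers can be made explicit with a few lines of Python. This is a sketch for illustration only; the helper names are hypothetical. It splits the first-phase primer into its defined 5′ part and its random 3′ tail, confirms that the second-phase primer is exactly the defined part, and counts the number of distinct sequences represented by the random tail.

```python
T3N6 = "AATTAACCCTCACTAAAGGGNNNNNN"  # SEQ ID NO.1, first amplification
T3 = "AATTAACCCTCACTAAAGGG"          # SEQ ID NO.2, second amplification


def split_primer(primer: str) -> tuple[str, int]:
    """Return the defined 5' part and the length of the random 3' tail (N's)."""
    defined = primer.rstrip("N")
    return defined, len(primer) - len(defined)


def fits_general_design(primer: str) -> bool:
    """Check the 10-30 defined / 5-10 random nucleotide design of the method."""
    defined, n_random = split_primer(primer)
    return 10 <= len(defined) <= 30 and 5 <= n_random <= 10


if __name__ == "__main__":
    defined, n_random = split_primer(T3N6)
    print("defined 5' part :", defined, f"({len(defined)} nt)")
    print("random 3' tail  :", n_random, "nt")
    print("tail variants   :", 4 ** n_random)
    print("T3 equals defined part:", defined == T3)
    print("fits general design   :", fits_general_design(T3N6))
```

Because the second primer anneals to the defined tag carried by every first-phase product, the second phase can be repeated on small aliquots of the first-phase material.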

c) Creation of a Map of a Portion of the Human Chromosome 2

Materials and Methods

Analysis of the marker distribution

EST (Expressed Sequence Tag) type markers, as well as a few previously mapped markers of the microsatellite type, were selected on the human chromosome 2 in the following databases: ncbi.nlm.nih.gov/genemap; ncbi.nlm.nih.gov.SCIENCE96/; shgc.stanford.edu/Mapping/rh/MapsV2/search2.html

The marker distribution in each daughter plate is analyzed by PCR from 1 microliter of the second amplification phase. The reaction medium contains 50 mM KCl; 10 mM Tris-HCl, pH 9.0; 0.1% Triton X-100; 2.5 mM MgCl₂; 250 μM of each dNTP; 1 μM of each of the specific upstream and downstream oligonucleotides of the analyzed marker and 0.5 U of Taq DNA polymerase (Promega) in 10 μl. The reaction comprises a first denaturation cycle of 1 min at 92° C., followed by 20 cycles of 20 secs at 92° C., 30 secs at 60° C. with a temperature decrement of 0.5° C./cycle and 30 secs at 72° C., then 10 cycles of 20 secs at 92° C., 30 secs at 50° C. and 30 secs at 72° C., and finally 1 elongation completion cycle of 2 mins at 72° C.
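The annealing temperatures of this touchdown program can be listed cycle by cycle. The sketch below is illustrative; the function name is hypothetical, and it assumes that the first touchdown cycle anneals at 60° C. and that the 0.5° C. decrement applies from the second cycle onward, which the text does not state explicitly.

```python
def touchdown_annealing(
    start_c: float = 60.0,
    decrement_c: float = 0.5,
    touchdown_cycles: int = 20,
    final_c: float = 50.0,
    final_cycles: int = 10,
) -> list[float]:
    """Return the annealing temperature (in degrees C) of every cycle."""
    # Touchdown phase: temperature drops by a fixed step each cycle.
    temps = [start_c - decrement_c * i for i in range(touchdown_cycles)]
    # Constant phase at the final annealing temperature.
    temps += [final_c] * final_cycles
    return temps


if __name__ == "__main__":
    for cycle, temp in enumerate(touchdown_annealing(), start=1):
        print(f"cycle {cycle:2d}: anneal at {temp:.1f} C")
```

Under this assumption the last touchdown cycle anneals at 50.5° C., so the program passes smoothly into the 10 cycles at 50° C.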

After migration on a 3% agarose gel, the PCR products are viewed and photographed on a UV plate equipped with a CCD camera (Bioprint).

Computer Processing

Subsequently, the presence/absence of each marker in the wells of the microplates is manually entered on the computer as a ternary code using the digits 0 (marker absent), 1 (marker present) or 2 (doubtful result), interpretable by conventional mapping software packages. The various linkage groups are then formed, after calculating the association frequencies of the markers (LOD score), by means of the RH2PT program from the RHMAP software suite (Boehnke et al., 1991). The distances separating the markers as well as their order are finally determined by means of the RHMAXLIK program of the same suite.
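The ternary scoring can be illustrated with a small tally of how often two markers are found together. This is a sketch only: it does not reproduce the calculations of RH2PT or RHMAXLIK, the function name is hypothetical, and the score strings are invented for the example. Wells scored 2 (doubtful) for either marker are simply ignored.

```python
def pair_counts(scores_a: str, scores_b: str) -> dict[str, int]:
    """Tally co-occurrence of two markers over wells scored 0, 1 or 2.

    Each string holds one digit per well: 0 absent, 1 present, 2 doubtful.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both markers must be scored on the same wells")
    counts = {"both": 0, "only_a": 0, "only_b": 0, "neither": 0}
    for a, b in zip(scores_a, scores_b):
        if a == "2" or b == "2":
            continue  # doubtful result: well not informative for this pair
        if a == "1" and b == "1":
            counts["both"] += 1
        elif a == "1":
            counts["only_a"] += 1
        elif b == "1":
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical scores for two markers over seven wells.
    print(pair_counts("1101020", "1100120"))
```

A high proportion of concordant wells (both or neither) relative to discordant ones is what the LOD score calculation then quantifies.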

Results

In a first phase, 190 human markers were selected in the databases on an area covering about 60 megabases of the human chromosome 2. Each of these markers was tested by PCR on the wells of the microplate forming the mapping panel. The presence/absence of these markers is visualized by PCR followed by electrophoresis and staining with ethidium bromide. Among these 190 markers, 176 were integrated into the map (>92%; FIG. 2). The others were discarded because their retention rate deviated too far from the norm. Indeed, only markers for which the number of positives lies within p±2(p−[p²/n]), wherein p is the average number of positives per marker and n is the number of aliquots forming the panel, are retained. Statistical analysis of the marker distribution is carried out by means of the RHMap software suite (Boehnke et al., 1991). First, an initial grouping of markers is performed by a pairwise (‘two by two’) analysis with a minimum LOD score threshold of 8 for accepting the association between markers. The markers were then placed with respect to their neighbors by a multipoint analysis within each linkage group generated at a LOD score of 8. Subsequently, the groups were positioned relative to each other by choosing a LOD score of 4.
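The retention criterion can be sketched as a simple filter. The code below is illustrative only; the function names and the counts are hypothetical. By default it applies the expression exactly as printed, p±2(p−[p²/n]). The quantity p−p²/n is the binomial variance of the number of positives, so a conventional two-standard-deviation window would use its square root; that reading is an assumption not stated in the text and is offered only as an option (`use_sqrt=True`).

```python
import math


def retention_window(
    positives: list[int], n_wells: int, use_sqrt: bool = False
) -> tuple[float, float]:
    """Return (low, high) bounds on the accepted number of positive wells."""
    p = sum(positives) / len(positives)  # average positives per marker
    spread = p - p * p / n_wells         # term in parentheses, as printed
    if use_sqrt:
        spread = math.sqrt(spread)       # assumption: standard-deviation reading
    return p - 2 * spread, p + 2 * spread


def retained(positives: list[int], n_wells: int, use_sqrt: bool = False) -> list[int]:
    """Indices of markers whose positive count falls inside the window."""
    low, high = retention_window(positives, n_wells, use_sqrt)
    return [i for i, count in enumerate(positives) if low <= count <= high]


if __name__ == "__main__":
    counts = [60, 64, 62, 20, 90]  # hypothetical positives per marker, 96 wells
    print(retention_window(counts, 96))
    print(retained(counts, 96, use_sqrt=True))
```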

Among the 176 markers mapped in this way, 133 were placed unambiguously relative to each other (>75%, FIG. 2). However, 43 other markers (left portion of the map; FIG. 2), although showing an association with one or several already localized markers strong enough to be accepted as real, could not be placed formally because of a weaker association with markers located in close vicinity. For this reason, they have been marked with floating bars. It should be noted that although the map was obtained with 176 markers (density of 1 marker/340 kb), similar results would have been obtained by using only 75 carefully selected markers, which would bring the minimum density down to 1 marker/800 kb.
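The proportions and densities quoted in the results follow directly from the marker counts and the size of the region, as the short calculation below shows. It is a sketch for verification only and uses only figures given in the text (190 markers tested, 176 mapped, 133 placed unambiguously, about 60 megabases, 75 markers for the minimal set); the function name is illustrative.

```python
REGION_KB = 60_000  # about 60 megabases of chromosome 2


def spacing_kb(n_markers: int, region_kb: int = REGION_KB) -> float:
    """Average spacing between markers, in kilobases."""
    return region_kb / n_markers


if __name__ == "__main__":
    print(f"mapped / tested      : {176 / 190:.1%}")
    print(f"unambiguous / mapped : {133 / 176:.1%}")
    print(f"spacing, 176 markers : {spacing_kb(176):.0f} kb")
    print(f"spacing, 75 markers  : {spacing_kb(75):.0f} kb")
```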

EXAMPLE 2 Comparison of the HAPPY Map with the Other Maps Available in Databases

In order to validate the map mentioned above, the obtained data were compared with those available in the databases. Presently, a large amount of data is available from mapping by radiation hybrids or by family studies. Priority was given to comparing our results with those obtained with radiation hybrids on the G3 mapping panel, which was obtained by irradiating cells at 10,000 rads (Stewart et al., 1997). The average size of the fragments obtained there is about 4 Mb, which is very close to the size of the fragments which we selected for forming our panel (5.5 Mb). 48 common markers were placed on both maps (FIG. 3). The general order of these markers is conserved in both maps, as only 4 inversions were revealed. These pairwise inversions are only observed for markers that are close on the genome, at a distance estimated to be less than 500 kb, to be compared with the resolution limit of the RH panel (shgc.stanford.edu), estimated to be 300 kb. In the region located immediately above the centromere, three markers have a relatively divergent position. This may be correlated with the fact that mapping by radiation hybrids may cause biases in the resolution of regions surrounding the centromere. This fact may also explain the differences in the distances separating the markers in these regions (FIG. 3). The position of the markers which we have obtained is in good agreement with the genetic map available in the databases (FIG. 3).
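A comparison of marker order between two maps, as described above, can be sketched as follows. The code is illustrative only: the function name and the marker names are hypothetical, and it reports only inversions between markers that are neighbors in the first map, which is one simple way of counting pairwise inversions and not necessarily the procedure used to obtain the figure of 4.

```python
def adjacent_inversions(order_a: list[str], order_b: list[str]) -> list[tuple[str, str]]:
    """Pairs of markers adjacent in map A whose relative order is reversed in map B.

    Only markers present on both maps are considered.
    """
    position_b = {marker: i for i, marker in enumerate(order_b)}
    common = [marker for marker in order_a if marker in position_b]
    return [
        (first, second)
        for first, second in zip(common, common[1:])
        if position_b[first] > position_b[second]
    ]


if __name__ == "__main__":
    # Hypothetical marker orders on two maps of the same region.
    happy_map = ["M1", "M2", "M3", "M4", "M5"]
    rh_map = ["M1", "M3", "M2", "M4", "M5"]
    print(adjacent_inversions(happy_map, rh_map))
```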

REFERENCES

Apostolakos (1993) Anal. Biochem. 213, 277-284

Berndt (1995) Anal. Biochem. 225, 252-257

Boehnke M et al. (1991) Statistical methods for multipoint radiation hybrid Mapping. Am. J. Hum. Genet., 49, 1174-1188

Cheung V G and Nelson S F (1996) Whole genome amplification using a degenerate oligonucleotide primer allows hundreds of genotypes to be performed on less than one nanogram of genomic DNA. Proc. Natl. Acad. Sci. USA, 93, 14676-14679

Cox D R et al. (1990) Radiation hybrid Mapping: a somatic cell genetic method for constructing high-resolution maps of mammalian chromosomes. Science, 250, 245-250

Dear P H and Cook P (1993) Happy Mapping: linkage Mapping using a physical analogue of meiosis. Nucl. Acids Res., 21, 13-20

Dear P H et al. (1998) A high resolution metric HAPPY map of human chromosome 14. Genomics, 48, 232-241

Deloukas P et al. (1998) A Physical Map of 30,000 Human Genes. Science, 282, 744-746

Grothues D et al. (1993) PCR amplification of megabase DNA with tagged random primers (T-PCR). Nucleic Acids Res., 21, 1321-1322

Gyapay G et al. (1996) A radiation hybrid map of the human genome. Hum. Mol. Genet., 5, 339-346

Kawai (1993) Anal. Biochem. 209, 63-69

Lander E S and Botstein D B (1989) Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics, 121, 185-199

Piper M B et al. (1998) A HAPPY map of Cryptosporidium parvum. Genome Res., 8, 1299-1307

Running (1990) Biotechniques, 8, 276-277

Slonim D et al. (1995) RHMAPPER: An interactive computer package for constructing radiation hybrid maps. Cold Spring Harbor Meeting on Genome Mapping and Sequencing. May 10-14

Stewart E A et al. (1997) An STS-based radiation hybrid map of the human genome. Genome Res., 7, 422-433

Telenius H et al. (1992) Degenerate Oligo-Primed PCR: General amplification of target DNA by a single degenerate primer. Genomics, 13, 718-725

Venter et al. (1998) Science compass June 5th, p. 1540.

Weber J L and Myers E W (1997) Human whole-genome shotgun sequencing. Genome Res., 7, 401-409

Zhang L et al. (1992) Whole genome amplification from a single cell: Implications for genetic analysis. Proc. Natl. Acad. Sci. USA, 89, 5847-5851

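SEQUENCE LISTING

Number of sequences: 2

SEQ ID NO. 1: length 26; type DNA; organism Artificial Sequence; description Primer i; sequence aattaaccct cactaaaggg nnnnnn

SEQ ID NO. 2: length 20; type DNA; organism Artificial Sequence; description Primer ii; sequence aattaaccct cactaaaggg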

What is claimed is:
 1. A method for mapping a DNA molecule comprising the steps: (A) breaking the DNA molecule in order to obtain DNA fragments, wherein the size of the fragments depends on a selected resolution, (B) distributing said fragments in receptacles in order to have an amount of DNA between about 0.5 to 1.5 DNA haploid genome equivalent per receptacle, (C) amplifying the DNA contained in the receptacles by an amplification method comprising the following steps: i) a first amplification by means of a primer comprising 10 to 30 defined nucleotides at its 5′ end and 5 to 10 random nucleotides at its 3′ end, ii) a second amplification by means of a primer comprising at least the defined oligonucleotides of the 5′ end of the primer used in step i), (D) detecting the presence or absence of markers in the receptacles (E) calculating an order of the markers and a distance between the markers to obtain a map of the DNA molecule.
 2. The method as claimed in claim 1, wherein the primer used in step i) comprises 20 defined nucleotides at its 5′ end and 6 random nucleotides at its 3′ end.
 3. The method as claimed in claim 2, wherein the primer used in step i) has the sequence of SEQ ID NO. 1.
 4. The method as claimed in claim 2, wherein the primer used in step ii) comprises the 20 defined nucleotides at the 5′ end of the primer used in step i).
 5. The method as claimed in claim 4, wherein the primer used in step ii) has the sequence of SEQ ID NO. 2.
 6. The method as claimed in claim 1 wherein the DNA molecule is a genome, a chromosome, or a genome or chromosome fragment or is derived from a genome, a chromosome, or a genome or chromosome fragment.
 7. The method as claimed in claim 6, wherein the DNA molecule is within a cell and the cell is encapsulated in agarose blocks and lysed to release the DNA molecule intact.
 8. The method as claimed in claim 7, wherein the cells are plant or animal cells.
 9. The method as claimed in claim 1 wherein the DNA molecule is broken by at least one of γ irradiation, enzymatic digestion, and mechanical action.
 10. The method as claimed in claim 1 further comprising a step before the distribution step (B) wherein the DNA fragments are separated by pulsed field gel electrophoresis to obtain fragments of homogeneous sizes.
 11. The method as claimed in claim 1, wherein the fragments are distributed in wells of a 96-, 182- or 384-well microtitration plate.
 12. The method as claimed in claim 11, wherein the entire genetic information contained in the wells is amplified.
 13. The method as claimed in claim 11, wherein the amplified DNA in each well is distributed to allow preparation of daughter plates.
 14. The method as claimed in claim 13, wherein the markers are detected in wells of the daughter plates.
 15. The method as claimed in claim 11, wherein the markers present in the wells are amplified by means of specific primers before the detection step.
 16. The method as claimed in claim 10, wherein the detection is carried out directly after electrophoresis or with probes specific to the markers.
 17. The method as claimed in claim 16, wherein the probes are capture probes, which are directly or indirectly immobilized on a solid support.
 18. The method as claimed in claim 16 wherein the probes are marked by means of a marker including radioactive isotopes, enzymes capable of acting on a chromogenic, fluorigenic or luminescent substrate, and chromophore chemical compounds, chromogenic, fluorigenic and luminescent compounds.
 19. The method as claimed in claim 1 wherein the detection is performed on DNA chips.
 20. The method as claimed in claim 1 wherein the fragments are distributed with 1 haploid genome equivalent per well in step (B).